Viterbi decoder using restructured trellis

ABSTRACT

Viterbi decoding is implemented using an asymmetrical trellis 70 having an A-trellis 72 and a B-trellis 74. The trellis 70 is designed for efficient implementation on a processing device 40 with arithmetic units 42 having multi-field arithmetic and logic capabilities. By concurrently processing multiple path metrics in separate fields, a highly efficient decoder may be implemented in a software-controlled device.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] Not Applicable.

STATEMENT OF FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

[0002] Not Applicable.

BACKGROUND OF THE INVENTION

[0003] 1. Technical Field

[0004] This invention relates in general to electronic communications and, more particularly, to error correction using a Viterbi decoder.

[0005] 2. Description of the Related Art

[0006] Many electronic devices use error correction techniques in conjunction with data transfers between components and/or data storage. Error correction is used in many situations, but is particularly important for wireless data communications, where data can easily be corrupted between the transmitter and the receiver. In some cases, errant data is identified as such and retransmission is requested. Using more robust error correction schemes, however, errant data can be reconstructed without retransmission.

[0007] One popular error correction technique uses Viterbi decoding to detect errors in a data stream from a convolutional encoder. A Viterbi decoder determines costs associated with multiple possible paths between nodes. After a specified number of stages, the node with the minimum associated cost is chosen, and a path is traced back through the previous stages. The data is decoded based on the selected path.

[0008] Actual implementations of Viterbi decoding use dedicated hardware, because the decoding is computationally intensive. More and more devices, however, are turning to DSPs (digital signal processors) to handle the computational chores. Additional circuitry dedicated to Viterbi decoding on a DSP is undesirable, because it adds to the cost of the DSP and the power consumed by the DSP.

[0009] Accordingly, a need has arisen for a method and apparatus for performing Viterbi decoding in software.

BRIEF SUMMARY OF THE INVENTION

[0010] The present invention performs a Viterbi decoding function by calculating candidate path metrics for states at time T_(n) based on previously calculated path metrics for states at time T_(n−1) and branch metrics associated with transitions between the states at time T_(n−1) and states at time T_(n) according to a first trellis, selecting path metrics for states at time T_(n) from the candidate path metrics, and calculating candidate path metrics for states at T_(n+1) based on the selected path metrics for states at T_(n) according to a second trellis, different from the first trellis.

[0011] Using an asymmetrical trellis structure can provide efficiencies that allow a programmable processing device, such as a digital signal processor, to provide Viterbi decoding at high speeds.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

[0012] For a more complete understanding of the present invention, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:

[0013] FIG. 1 is an example of a data communication connection used in the prior art;

[0014] FIG. 2 is a block diagram of a conventional data encoder;

[0015] FIG. 3 is a state diagram of the encoder of FIG. 2;

[0016] FIG. 4 is a trellis diagram showing data transitions;

[0017] FIG. 5 is a trellis diagram showing the decoding of the data from the encoder of FIG. 2;

[0018] FIGS. 6a through 6d are trellis diagrams showing the calculation of path metrics through the trellis diagram;

[0019] FIG. 7 is a block diagram of a programmable processing device capable of multi-field arithmetic and logic operations;

[0020] FIG. 8 is a prior art trellis diagram for a 16-state Viterbi decoder;

[0021] FIG. 9 is an asymmetrical trellis pair used in conjunction with the processing device of FIG. 7;

[0022] FIGS. 10a through 10d are partial trellis diagrams showing Viterbi decoding for respective destination state groups;

[0023] FIG. 11 illustrates various registers used in the implementation of the Viterbi decoder;

[0024] FIG. 12 illustrates a flow chart describing implementation of the first trellis of the trellis pair;

[0025] FIG. 13 illustrates a flow chart describing implementation of the second trellis of the trellis pair; and

[0026] FIG. 14a through FIG. 14d illustrate partial trellis diagrams for a first destination state group of the second trellis over respective passes.

DETAILED DESCRIPTION OF THE INVENTION

[0027] The present invention is best understood in relation to FIGS. 1-14 of the drawings, like numerals being used for like elements of the various drawings.

[0028] FIG. 1 illustrates a general block diagram of communications between a data source and destination using convolutional encoding. At the source, k-bit data is received by a convolutional encoder 12. The convolutional encoder 12 generates an n-bit encoded data output based on the received data. The encoded data is transmitted to the destination through a transmission medium 14. During transmission, noise may be added to the encoded data, thereby corrupting some of the output. At the destination, the possibly corrupted data is received by Viterbi decoder 16. The Viterbi decoder recovers the original data; even if the encoded data is corrupted, the Viterbi decoder is able to recover the original data in many situations.

[0029] For illustration of convolutional encoding, an example using a k=1, n=2 structure is shown in FIG. 2. The encoder 12 receives the data to be encoded into a flip-flop 18 and two modulo-2 adders 20 and 22. The output of flip-flop 18 is also received by an input of modulo-2 adder 20. The output of flip-flop 18 is also coupled to the input of flip-flop 24. The output of flip-flop 24 is coupled to an input of modulo-2 adder 20 and an input of modulo-2 adder 22. The encoded output XY of the convolutional encoder 12 is the output of modulo-2 adder 20 (X) and modulo-2 adder 22 (Y).

[0030] The convolutional encoder 12 has a constraint length (K) of 3, meaning that the current output is dependent upon the last three inputs. This dependency of the encoded data output on previous values allows the Viterbi decoder to reconstruct the data despite transmission errors. Convolutional encoders are often classified as (n,k,K) encoders; hence the encoder shown in FIG. 2 would be a (2,1,3) encoder. The connection vectors, which define the connections between the shift register formed by flip-flops 18 and 24 and the modulo-2 adders, for the encoder shown in FIG. 2 are “111” for modulo-2 adder 20 and “101” for modulo-2 adder 22.

[0031] The “state” of the encoder 12 is defined as the outputs of the flip-flops 18 and 24. Thus the state of encoder 12 can be notated as “(output of FF 18, output of FF 24)”. A state diagram for the encoder of FIG. 2 is shown in FIG. 3. Each of the four possible states (00, 01, 10 and 11) is shown within a circle. Transitions between states are shown responsive to a data input of “0” (solid line) or a data input of “1” (dashed line). The two-bit value above the transition line is the resulting output XY. Thus, from a state of “00”, an input of “0” will result in a return to “00” with an output of “00”. An input of “1” will result in a transition to “10” and an output of “11”.

[0032] The state diagram of FIG. 3 shows the transitions from any state at any given moment. In FIG. 4, a “trellis” diagram is used to show the transitions over time. From an arbitrary time, T₂, the trellis diagram of FIG. 4 shows the possible state transitions and outputs responsive to a given data input.

[0033] FIG. 5 shows an example of a path through the trellis using a data input sequence of “1011” from an initial state of “00”. The initial data input “1” causes a transition from state “00” to state “10” and an encoded output of “11”. The next data input, “0”, causes a transition from state “10” to state “01” and an encoded output of “10”. The following data input, “1”, causes a transition from state “01” to “10” and an encoded output of “00”. The final data input, “1”, causes a transition from state “10” to state “11” and an encoded output of “01”.
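The encoding walk-through above can be reproduced with a short sketch. The following Python is only an illustration of the (2,1,3) encoder as described for FIG. 2; the function name `encode` and the tuple representation of the state are assumptions made for this example and do not come from the specification.

```python
def encode(bits, state=(0, 0)):
    """Encode a bit sequence with the (2,1,3) encoder described for FIG. 2.

    `state` is (output of flip-flop 18, output of flip-flop 24).
    Returns the list of two-bit outputs "XY" and the final state.
    """
    ff18, ff24 = state
    outputs = []
    for b in bits:
        x = b ^ ff18 ^ ff24   # modulo-2 adder 20, connection vector "111"
        y = b ^ ff24          # modulo-2 adder 22, connection vector "101"
        outputs.append(f"{x}{y}")
        ff18, ff24 = b, ff18  # shift: input -> FF 18, FF 18 -> FF 24
    return outputs, (ff18, ff24)


if __name__ == "__main__":
    outs, final_state = encode([1, 0, 1, 1])
    print(" ".join(outs), final_state)  # 11 10 00 01 (1, 1)
```

Running the sketch on the input “1011” from state “00” yields the outputs and the final state “11” given in the text.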

[0034] The encoded output “11 10 00 01” will be transmitted to a receiving device with a Viterbi decoder. The two-bit encoded outputs are used to reconstruct the data. By convention, a data transmission begins in state “00”. Hence, the first encoded output “11” would signify that the first input data bit was a “1” and the next state was “10”. Assuming no errors in transmission, the data input could be determined by the state diagram of FIG. 3 or the trellis of FIG. 4.

[0035] However, in real-world conditions, the encoded data may be corrupted during transmission. In essence, the Viterbi decoder 16 traces all possible paths, maintaining a “path metric” for each path, which accumulates differences (“branch metrics”) between each of the encoded outputs actually received and the encoded outputs that would be expected for that path. The path with the lowest path metric is the maximum likelihood path.

[0036] FIG. 6a illustrates computation of the branch metrics for the transition from the initial state of “00”. In this case, an “11” was received. With two-bit outputs, a “Hamming distance” may be used to calculate the branch metric. The Hamming distance is the sum of the exclusive-OR operations on respective bits of the received output and the expected output. For the path assuming a “0” input, the branch metric between the received encoded output (“11”) and the expected encoded output (“00”) is two. For the path assuming a “1” input, the branch metric between the received encoded output (“11”) and the expected encoded output (“11”) is zero. Hence the path metric at state “00” at time T₁ is two and the path metric at state “10” at time T₁ is zero. The path metrics are shown above the states in the diagram.
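As a small illustration of the Hamming-distance branch metric just described, the sketch below sums the bitwise exclusive-OR of a received and an expected two-bit output. The function name `hamming` and the use of bit strings are assumptions made for this example.

```python
def hamming(received, expected):
    """Sum of the exclusive-OR of respective bits of two equal-length bit strings."""
    return sum(int(r) ^ int(e) for r, e in zip(received, expected))


if __name__ == "__main__":
    print(hamming("11", "00"))  # 2: path assuming a "0" input from state "00"
    print(hamming("11", "11"))  # 0: path assuming a "1" input from state "00"
```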

[0037] FIG. 6b illustrates the path through time T₂. In this example, it is assumed that there is a data transmission error, and the received encoded output is “11” rather than “10”. Hence, at T₂, the path metric is four for state “00”, one for state “01”, two for state “10” and one for state “11”.

[0038] FIG. 6c illustrates the path through time T₃. At this point, two potential paths are entering each state. For each state, the branch metric is computed for each path entering the state, and the path with the lowest path metric is chosen (the “surviving path”). If two paths have the same path metric (such as state “01” at T₃), a path can be chosen randomly or deterministically (such as by always choosing the upper path).

[0039] FIG. 6d shows the path through time T₄. At this point, the actual path through states “10 01 10 11” has the lowest path metric. If the example sequence were longer, the path metrics for all other paths would increase as the path metric for the actual path remained the same (assuming no additional errors). When the end of a path is reached, the most likely path is determined through a process called “traceback”.
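The add-compare-select and traceback steps described for FIGS. 6a-d can be sketched for the four-state example as follows. This is a plain hard-decision decoder written for illustration only; the names `step` and `viterbi_decode`, the dictionary-based bookkeeping, and the tie-breaking rule (keep the first candidate found) are assumptions, not the implementation of the specification.

```python
INF = float("inf")
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (FF 18, FF 24)


def step(state, bit):
    """Return (next state, expected output XY) for one input bit."""
    ff18, ff24 = state
    return (bit, ff18), (bit ^ ff18 ^ ff24, bit ^ ff24)


def viterbi_decode(received):
    """Decode a list of received (x, y) pairs, starting from state 00."""
    pm = {s: (0 if s == (0, 0) else INF) for s in STATES}
    history = []  # per stage: surviving (previous state, input bit) per state
    for rx in received:
        new_pm = {s: INF for s in STATES}
        back = {}
        for s in STATES:
            if pm[s] == INF:
                continue  # state not yet reachable
            for bit in (0, 1):
                nxt, out = step(s, bit)
                # candidate path metric = old path metric + Hamming branch metric
                cand = pm[s] + (out[0] ^ rx[0]) + (out[1] ^ rx[1])
                if cand < new_pm[nxt]:  # ties keep the first candidate
                    new_pm[nxt] = cand
                    back[nxt] = (s, bit)
        history.append(back)
        pm = new_pm
    # Traceback from the state with the lowest path metric.
    state = min(pm, key=pm.get)
    best = pm[state]
    bits = []
    for back in reversed(history):
        state, bit = back[state]
        bits.append(bit)
    return bits[::-1], best


if __name__ == "__main__":
    # Received "11 11 00 01": the second symbol has one bit in error.
    print(viterbi_decode([(1, 1), (1, 1), (0, 0), (0, 1)]))  # ([1, 0, 1, 1], 1)
```

With the single-bit error of the example, the sketch still recovers the input “1011”, and the surviving path ends with a path metric of one.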

[0040] As can be seen in FIGS. 6a-d, for each time period, a branch metric calculation and path metric calculation must be performed for each path entering a state. Further, a comparison must be performed to determine the surviving path. For the example shown in FIGS. 2-6, this is not terribly computation intensive. But for larger trellis structures, for example a radix-4 trellis, the computations involved may necessitate a dedicated hardware decoder, rather than a software Viterbi decoder.

[0041] The present invention is described in conjunction with a 16-state Viterbi decoder. The method of decoding the encoded information using software uses a programmable processing device, such as the C60 series of digital signal processors from TEXAS INSTRUMENTS INCORPORATED. A simplified block diagram showing the pertinent features of such a processing device is shown in FIG. 7.

[0042] In the preferred embodiment, the processing device 40 includes one or more arithmetic units 42 capable of multiple field arithmetic and a plurality of registers 44, typically arranged in a register file 46. The arithmetic units 42 have the ability to perform separate logical and arithmetic operations on predefined fields 48 within their input registers 50 under program control (represented by control logic 51). For example, if an arithmetic unit 42 uses 32-bit input registers, it could perform four simultaneous compares between four respective eight-bit fields 48 within the input registers 50. The method described herein takes advantage of the simultaneous operations in order to efficiently process information such that a software Viterbi decoding operation can be performed at a suitable speed.
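To make the idea of multi-field operations concrete, the sketch below emulates a 32-bit word holding four eight-bit fields. It is a software emulation only: the per-field loops stand in for what the arithmetic unit would do in a single operation, and the names `pack`, `unpack`, `add4` and `min4`, the field ordering, and the wrap-around behaviour on overflow are all assumptions made for this example (real hardware may saturate instead).

```python
FIELD_BITS = 8
FIELDS = 4
MASK = (1 << FIELD_BITS) - 1


def pack(values):
    """Pack four small integers into one 32-bit word (field 0 most significant)."""
    word = 0
    for v in values:
        word = (word << FIELD_BITS) | (v & MASK)
    return word


def unpack(word):
    """Split a 32-bit word back into its four eight-bit fields."""
    return [(word >> (FIELD_BITS * (FIELDS - 1 - i))) & MASK for i in range(FIELDS)]


def add4(a, b):
    """Field-wise addition; each field wraps modulo 256 independently."""
    return pack([(x + y) & MASK for x, y in zip(unpack(a), unpack(b))])


def min4(a, b):
    """Field-wise compare-and-select of the smaller value."""
    return pack([min(x, y) for x, y in zip(unpack(a), unpack(b))])


if __name__ == "__main__":
    pm = pack([1, 2, 3, 4])          # four path metrics
    bm = pack([10, 20, 30, 40])      # four branch metrics
    print(unpack(add4(pm, bm)))      # [11, 22, 33, 44]
    print(unpack(min4(pack([5, 1, 9, 3]), pack([2, 7, 9, 0]))))  # [2, 1, 9, 0]
```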

[0043] FIG. 8 illustrates a prior art sixteen-state Viterbi decoding stage 60. The sixteen states are notated in hexadecimal format as states 0-F. As shown in FIG. 5, this same stage is used between consecutive time periods (T_(n), T_(n+1)) throughout the decoder. However, the computation involved in using this type of decoder stage is too complex to accommodate a typical data rate using a software programmable device.

[0044] FIG. 9 illustrates an asymmetrical Viterbi decoding stage pair 70 that increases the efficiency of computation when used with a processing device of the type shown in FIG. 7. The decoding stage pair is “asymmetrical” because consecutive stages use different operations to perform the path metric calculations. For reference, the pair includes “A-trellis” 72 and “B-trellis” 74.

[0045] FIGS. 10a-d illustrate partial trellis diagrams showing how path metrics are concurrently calculated at four destination states. For example, referring to FIG. 10a, at T_(n+1), new path metrics for states 0, 4, 8 and C are concurrently calculated. Similarly, at T_(n+2), new path metrics for states 0, 1, 2, and 3 are concurrently calculated from the results of states 0, 4, 8 and C at T_(n+1). The lines indicate which fields are used for the calculations of the new path metrics; the particular fields used could be different depending upon the implementation.

[0046] Recapping the discussion of four-state Viterbi decoders, it can be seen from FIG. 10a that the path metric for state 0 at T_(n+1) is equal to the lowest of the sums of the path metrics at states 0, 4, 8 and C at T_(n) and the respective branch metrics between those states and state 0 at T_(n+1). To accurately describe the relationships, P(s,t) equals the path metric of state s at time t and B(s₁,s₂) is the branch metric between state s₁ and s₂ based on the received encoded data. Hence:

[0047] P(0,T_(n+1)) equals min[P(0,T_(n))+B(0,0), P(4,T_(n))+B(4,0), P(8,T_(n))+B(8,0), P(C,T_(n))+B(C,0)]

[0048] P(4,T_(n+1)) equals min[P(1,T_(n))+B(1,4), P(5,T_(n))+B(5,4), P(9,T_(n))+B(9,4), P(D,T_(n))+B(D,4)]

[0049] P(8,T_(n+1)) equals min[P(2,T_(n))+B(2,8), P(6,T_(n))+B(6,8), P(A,T_(n))+B(A,8), P(E,T_(n))+B(E,8)]

[0050] P(C,T_(n+1)) equals min[P(3,T_(n))+B(3,C), P(7,T_(n))+B(7,C), P(B,T_(n))+B(B,C), P(F,T_(n))+B(F,C)]
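The four relationships above share a pattern: each destination state d is reached from the four source states (d >> 2), (d >> 2) + 4, (d >> 2) + 8 and (d >> 2) + C. The sketch below states that pattern in code and applies it as a plain, one-state-at-a-time update. The pattern is inferred here from the equations and tables of this description; the names `sources` and `update` and the callable `bm(src, dst)` are assumptions made for this example.

```python
def sources(dest):
    """Source states feeding destination state `dest` in the 16-state trellis."""
    return [(dest >> 2) + 4 * k for k in range(4)]


def update(pm, bm):
    """One trellis stage computed state by state.

    pm: list of 16 path metrics at T_n.
    bm: callable bm(src, dst) returning the branch metric B(src, dst).
    Returns the list of 16 path metrics at T_(n+1).
    """
    return [min(pm[s] + bm(s, d) for s in sources(d)) for d in range(16)]


if __name__ == "__main__":
    for d in (0x0, 0x4, 0x8, 0xC):
        print(f"{d:X} <- {[f'{s:X}' for s in sources(d)]}")
```

This scalar form needs sixteen separate minimum searches per stage; the multi-field approach described next performs four of them at a time.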

[0051] The operations for FIGS. 10a-d will be discussed with reference to FIGS. 11, 12 and 13. FIG. 11 shows an example of a register file allocation. Four registers are used as temporary registers, TMP(0..3). These registers are used for intermediate calculations. Eight registers are used for storing path metrics. Each of these registers holds path metric values for four states in respective fields. For example, the register storing CMP(048C) stores the path metrics for states 0, 4, 8 and C. Of these eight, four registers are used to store the path metrics calculated by the A-trellis 72 and four registers are used to store the path metrics calculated by the B-trellis 74. The registers used to store the path metrics for the A-trellis 72 are used as the most recently calculated path metrics for the calculations performed by the B-trellis 74. Likewise, the registers used to store the path metrics for the B-trellis 74 are used as the most recently calculated path metrics for the calculations performed by the A-trellis 72.

[0052] Additionally, a number of registers storing branch metrics for the path metric calculations are provided. Because the branch metrics will depend upon the encoding scheme, their calculation will not be specified. No matter what method is used for calculating branch metrics, it should be possible to pre-calculate the branch metrics (as the data is received) for efficient calculation of the candidate path metrics.

[0053] FIG. 12 illustrates a flow chart describing the calculations using the A-trellis 72. It should be noted that this flowchart is meant to describe the various calculations being made to implement the A-trellis and is not a detailed description of a particular order. With multiple arithmetic units, some steps can be performed concurrently for the greatest time efficiency. For clarity, the line between states (representing the path metric) in the Figures illustrating the trellises is depicted according to the field in the four-field word which stores the intermediate result according to the following legend:

[0054] In block 80, variables q, r, and j are set to zero. For purposes of reference, the current time is T_(n+1). These variables are used as indices for various registers and states. In block 82, one input register (IR1) is loaded with a path metric register from the register file 46. On the first pass, therefore, CMP(0123) (see FIG. 11) is loaded into IR1. This register holds the path metrics computed for states 0, 1, 2 and 3 at T_(n).

[0055] In block 84, the other input register (IR2) is loaded with branch metrics based on the current data value, the source state and the destination state. There are four branch metrics in four fields. For the first pass, the branch metrics B(0,0), B(1,4), B(2,8) and B(3,C) are loaded. It is assumed that the branch metrics have been previously computed and packed into four-field words and stored in the register file 46.

[0056] A multi-field addition is performed in step 86, where the first field from IR1 is added to the first field in IR2, the second field in IR1 is added to the second field in IR2, and so on. The result is stored in a temporary register TMP(j). Hence on the first pass, the result will be stored in TMP(0). Accordingly, at the end of the first pass (j=0), TMP(0) will store the candidate path metrics associated with a transition from state 0 (at T_(n)) to state 0 (at T_(n+1)), state 1 (at T_(n)) to state 4 (at T_(n+1)), state 2 (at T_(n)) to state 8 (at T_(n+1)), and state 3 (at T_(n)) to state C (at T_(n+1)).

[0057] On each pass (i.e., for each increment of j in blocks 88 and 90), a new set of source states and branch metrics is used to calculate additional candidate path metrics. Thus, on the second pass (j=1), TMP(1) stores the candidate path metrics associated with a transition from state 4 (at T_(n)) to state 0 (at T_(n+1)), state 5 (at T_(n)) to state 4 (at T_(n+1)), state 6 (at T_(n)) to state 8 (at T_(n+1)), and state 7 (at T_(n)) to state C (at T_(n+1)). On subsequent passes, TMP(2) stores the candidate path metrics associated with a transition from state 8 (at T_(n)) to state 0 (at T_(n+1)), state 9 (at T_(n)) to state 4 (at T_(n+1)), state A (at T_(n)) to state 8 (at T_(n+1)), and state B (at T_(n)) to state C (at T_(n+1)). TMP(3) stores the candidate path metrics associated with a transition from state C (at T_(n)) to state 0 (at T_(n+1)), state D (at T_(n)) to state 4 (at T_(n+1)), state E (at T_(n)) to state 8 (at T_(n+1)), and state F (at T_(n)) to state C (at T_(n+1)).

TABLE 1
Content of TMP Registers After First Group

        State 0                State 4                State 8                State C
TMP(0)  P(0,T_(n)) + B(0,0)    P(1,T_(n)) + B(1,4)    P(2,T_(n)) + B(2,8)    P(3,T_(n)) + B(3,C)
TMP(1)  P(4,T_(n)) + B(4,0)    P(5,T_(n)) + B(5,4)    P(6,T_(n)) + B(6,8)    P(7,T_(n)) + B(7,C)
TMP(2)  P(8,T_(n)) + B(8,0)    P(9,T_(n)) + B(9,4)    P(A,T_(n)) + B(A,8)    P(B,T_(n)) + B(B,C)
TMP(3)  P(C,T_(n)) + B(C,0)    P(D,T_(n)) + B(D,4)    P(E,T_(n)) + B(E,8)    P(F,T_(n)) + B(F,C)

[0058] For the first group of destination states (0,4,8,C), when j=3 in block 88, the four temporary registers TMP(0..3) hold, in respective fields, the candidate path metrics of the four possible transitions to states 0, 4, 8 and C, as shown in Table 1. In block 92, the respective fields of TMP(0) and TMP(1) are compared and the paths with the lowest path metric for each field are selected and stored back in TMP(0). In block 94, the respective fields of TMP(2) and TMP(3) are compared and the paths with the lowest path metric for each field are selected and stored in TMP(1). Finally, in block 96, TMP(0) and TMP(1) are compared and the lowest path metric for each field is stored in the appropriate register associated with the states being evaluated. For the A-Trellis of FIG. 10a, this would be CMP(048C).

[0059] In blocks 98 and 100, the same flow as described above is used to determine the lowest cost path for states associated with CMP(159D), CMP(26AE), and CMP(37BF). The contents of the TMP registers prior to the comparisons are illustrated below in Tables 2-4.

TABLE 2
Content of TMP Registers After Second Group

        State 1                State 5                State 9                State D
TMP(0)  P(0,T_(n)) + B(0,1)    P(1,T_(n)) + B(1,5)    P(2,T_(n)) + B(2,9)    P(3,T_(n)) + B(3,D)
TMP(1)  P(4,T_(n)) + B(4,1)    P(5,T_(n)) + B(5,5)    P(6,T_(n)) + B(6,9)    P(7,T_(n)) + B(7,D)
TMP(2)  P(8,T_(n)) + B(8,1)    P(9,T_(n)) + B(9,5)    P(A,T_(n)) + B(A,9)    P(B,T_(n)) + B(B,D)
TMP(3)  P(C,T_(n)) + B(C,1)    P(D,T_(n)) + B(D,5)    P(E,T_(n)) + B(E,9)    P(F,T_(n)) + B(F,D)

[0060] TABLE 3
Content of TMP Registers After Third Group

        State 2                State 6                State A                State E
TMP(0)  P(0,T_(n)) + B(0,2)    P(1,T_(n)) + B(1,6)    P(2,T_(n)) + B(2,A)    P(3,T_(n)) + B(3,E)
TMP(1)  P(4,T_(n)) + B(4,2)    P(5,T_(n)) + B(5,6)    P(6,T_(n)) + B(6,A)    P(7,T_(n)) + B(7,E)
TMP(2)  P(8,T_(n)) + B(8,2)    P(9,T_(n)) + B(9,6)    P(A,T_(n)) + B(A,A)    P(B,T_(n)) + B(B,E)
TMP(3)  P(C,T_(n)) + B(C,2)    P(D,T_(n)) + B(D,6)    P(E,T_(n)) + B(E,A)    P(F,T_(n)) + B(F,E)

[0061] TABLE 4
Content of TMP Registers After Fourth Group

        State 3                State 7                State B                State F
TMP(0)  P(0,T_(n)) + B(0,3)    P(1,T_(n)) + B(1,7)    P(2,T_(n)) + B(2,B)    P(3,T_(n)) + B(3,F)
TMP(1)  P(4,T_(n)) + B(4,3)    P(5,T_(n)) + B(5,7)    P(6,T_(n)) + B(6,B)    P(7,T_(n)) + B(7,F)
TMP(2)  P(8,T_(n)) + B(8,3)    P(9,T_(n)) + B(9,7)    P(A,T_(n)) + B(A,B)    P(B,T_(n)) + B(B,F)
TMP(3)  P(C,T_(n)) + B(C,3)    P(D,T_(n)) + B(D,7)    P(E,T_(n)) + B(E,B)    P(F,T_(n)) + B(F,F)
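The A-trellis flow described for FIG. 12 can be sketched as below and checked against a plain state-by-state update. This is an illustrative model, not the implementation of the specification: Python lists of four values stand in for four-field registers, `add4` and `min4` stand in for the multi-field add and compare operations, and the branch metric table `bm[src][dst]` is filled with arbitrary test values. All function names are assumptions made for this example.

```python
import random


def sources(dest):
    """Source states of `dest`, following the pattern of the equations above."""
    return [(dest >> 2) + 4 * k for k in range(4)]


def reference_update(pm, bm):
    """State-by-state update used only to check the multi-field version."""
    return [min(pm[s] + bm[s][d] for s in sources(d)) for d in range(16)]


def add4(a, b):
    return [x + y for x, y in zip(a, b)]       # stands in for one multi-field add


def min4(a, b):
    return [min(x, y) for x, y in zip(a, b)]   # stands in for one multi-field compare/select


def a_trellis(pm, bm):
    # Source registers CMP(0123), CMP(4567), CMP(89AB), CMP(CDEF).
    src_regs = [pm[4 * j:4 * j + 4] for j in range(4)]
    new_pm = [None] * 16
    for g in range(4):                          # destination group: g, g+4, g+8, g+C
        dests = [g + 4 * f for f in range(4)]
        tmp = []
        for j in range(4):                      # four passes fill TMP(0..3)
            ir1 = src_regs[j]
            ir2 = [bm[4 * j + f][dests[f]] for f in range(4)]
            tmp.append(add4(ir1, ir2))
        tmp[0] = min4(tmp[0], tmp[1])           # block 92
        tmp[1] = min4(tmp[2], tmp[3])           # block 94
        result = min4(tmp[0], tmp[1])           # block 96
        for f, d in enumerate(dests):
            new_pm[d] = result[f]
    return new_pm


if __name__ == "__main__":
    rng = random.Random(0)
    pm = [rng.randrange(50) for _ in range(16)]
    bm = [[rng.randrange(4) for _ in range(16)] for _ in range(16)]
    print(a_trellis(pm, bm) == reference_update(pm, bm))  # True
```

In this model, each destination group costs four multi-field additions and three multi-field compares, instead of sixteen additions and twelve compares when the four states are handled one at a time.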

[0062] The operation of the processing device 40 to implement the B-Trellis 74 is somewhat different. As shown in FIG. 10a, the first group of destination states for this trellis includes states 0, 1, 2 and 3. The computation of the candidate path metrics for this group is based on a single set of four source states: states 0, 4, 8, and C (the destination states for the preceding A-Trellis 72). The preferred implementation of this trellis rotates the contents of the registers computed in the previous A-Trellis 72 to derive the four candidate path metrics for each destination state.

[0063] A flow chart describing the implementation of the B-Trellis 74 is given in FIG. 13. In FIGS. 14a-d, the four passes for the first group (destination states 0, 1, 2, 3) are separated for reference. The final path metrics for each destination state are:

[0064] P(0,T_(n+1)) equals min[P(0,T_(n))+B(0,0), P(4,T_(n))+B(4,0), P(8,T_(n))+B(8,0), P(C,T_(n))+B(C,0)]

[0065] P(1,T_(n+1)) equals min[P(4,T_(n))+B(4,1), P(8,T_(n))+B(8,1), P(C,T_(n))+B(C,1), P(0,T_(n))+B(0,1)]

[0066] P(2,T_(n+1)) equals min[P(8,T_(n))+B(8,2), P(C,T_(n))+B(C,2), P(0,T_(n))+B(0,2), P(4,T_(n))+B(4,2)]

[0067] P(3,T_(n+1)) equals min[P(C,T_(n))+B(C,3), P(0,T_(n))+B(0,3), P(4,T_(n))+B(4,3), P(8,T_(n))+B(8,3)]

[0068] In block 110, the indices are initialized. In block 112, IR1 is loaded from a register 44 storing a set of previously calculated path metrics. For the first group of destination states (shown in FIGS. 14a-d), CMP(048C) is loaded into IR1. For the second, third and fourth groups of destination states, CMP(159D), CMP(26AE) and CMP(37BF), respectively, will be loaded.

[0069] In block 114, the appropriate branch metrics are loaded into respective fields of input register IR2. On the first pass (C=0), the four fields are set to B(r,q)|B(r+4,q+1)|B(r+8,q+2)|B(r+C,q+3). On the second pass, the four fields are set to B(r+4,q)|B(r+8,q+1)|B(r+C,q+2)|B(r,q+3). On the third pass, the four fields are set to B(r+8,q)|B(r+C,q+1)|B(r,q+2)|B(r+4,q+3). On the fourth pass, the four fields are set to B(r+C,q)|B(r,q+1)|B(r+4,q+2)|B(r+8,q+3).
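The four field assignments listed for block 114 follow one rule: on each pass the source index in every field advances by 4, wrapping from r+C back to r, while the destination indices stay fixed. The sketch below generates the field labels from that rule. The function name `bm_fields` and the zero-based pass index `p` are assumptions made for this example.

```python
def bm_fields(r, q, p):
    """Labels of the four IR2 fields on pass p (0 = first pass).

    r is the lowest source state of the group and q the lowest destination state.
    """
    labels = []
    for f in range(4):
        src = r + 4 * ((f + p) % 4)   # source rotates by one field per pass
        dst = q + f                   # destination is fixed per field
        labels.append(f"B({src:X},{dst:X})")
    return labels


if __name__ == "__main__":
    for p in range(4):
        print("|".join(bm_fields(0, 0, p)))
    # B(0,0)|B(4,1)|B(8,2)|B(C,3)
    # B(4,0)|B(8,1)|B(C,2)|B(0,3)
    # B(8,0)|B(C,1)|B(0,2)|B(4,3)
    # B(C,0)|B(0,1)|B(4,2)|B(8,3)
```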

[0070] In block 116, the multi-field addition is performed, rendering one candidate path metric for each of the four destination states, which is stored in a TMP register. In blocks 118 and 120, fields in IR1 are rotated on each pass. The reason for rotating the fields is that each of the input states is used in a candidate path metric calculation over four passes. Thus, for example, source state 0 (T_(n)) is used for a path metric calculation for destination state 0 in the first pass, for destination state 3 in the second pass, destination state 2 in the third pass and destination state 1 in the fourth pass. The rotation aligns the source state with the proper field for the multi-field add operation (see Tables 5-8, below).

TABLE 5
Content of TMP Registers After First Group

        State 0                State 1                State 2                State 3
TMP(0)  P(0,T_(n)) + B(0,0)    P(4,T_(n)) + B(4,1)    P(8,T_(n)) + B(8,2)    P(C,T_(n)) + B(C,3)
TMP(1)  P(4,T_(n)) + B(4,0)    P(8,T_(n)) + B(8,1)    P(C,T_(n)) + B(C,2)    P(0,T_(n)) + B(0,3)
TMP(2)  P(8,T_(n)) + B(8,0)    P(C,T_(n)) + B(C,1)    P(0,T_(n)) + B(0,2)    P(4,T_(n)) + B(4,3)
TMP(3)  P(C,T_(n)) + B(C,0)    P(0,T_(n)) + B(0,1)    P(4,T_(n)) + B(4,2)    P(8,T_(n)) + B(8,3)

[0071] TABLE 6
Content of TMP Registers After Second Group

        State 4                State 5                State 6                State 7
TMP(0)  P(1,T_(n)) + B(1,4)    P(5,T_(n)) + B(5,5)    P(9,T_(n)) + B(9,6)    P(D,T_(n)) + B(D,7)
TMP(1)  P(5,T_(n)) + B(5,4)    P(9,T_(n)) + B(9,5)    P(D,T_(n)) + B(D,6)    P(1,T_(n)) + B(1,7)
TMP(2)  P(9,T_(n)) + B(9,4)    P(D,T_(n)) + B(D,5)    P(1,T_(n)) + B(1,6)    P(5,T_(n)) + B(5,7)
TMP(3)  P(D,T_(n)) + B(D,4)    P(1,T_(n)) + B(1,5)    P(5,T_(n)) + B(5,6)    P(9,T_(n)) + B(9,7)

[0072] TABLE 7
Content of TMP Registers After Third Group

        State 8                State 9                State A                State B
TMP(0)  P(2,T_(n)) + B(2,8)    P(6,T_(n)) + B(6,9)    P(A,T_(n)) + B(A,A)    P(E,T_(n)) + B(E,B)
TMP(1)  P(6,T_(n)) + B(6,8)    P(A,T_(n)) + B(A,9)    P(E,T_(n)) + B(E,A)    P(2,T_(n)) + B(2,B)
TMP(2)  P(A,T_(n)) + B(A,8)    P(E,T_(n)) + B(E,9)    P(2,T_(n)) + B(2,A)    P(6,T_(n)) + B(6,B)
TMP(3)  P(E,T_(n)) + B(E,8)    P(2,T_(n)) + B(2,9)    P(6,T_(n)) + B(6,A)    P(A,T_(n)) + B(A,B)

[0073] TABLE 8
Content of TMP Registers After Fourth Group

        State C                State D                State E                State F
TMP(0)  P(3,T_(n)) + B(3,C)    P(7,T_(n)) + B(7,D)    P(B,T_(n)) + B(B,E)    P(F,T_(n)) + B(F,F)
TMP(1)  P(7,T_(n)) + B(7,C)    P(B,T_(n)) + B(B,D)    P(F,T_(n)) + B(F,E)    P(3,T_(n)) + B(3,F)
TMP(2)  P(B,T_(n)) + B(B,C)    P(F,T_(n)) + B(F,D)    P(3,T_(n)) + B(3,E)    P(7,T_(n)) + B(7,F)
TMP(3)  P(F,T_(n)) + B(F,C)    P(3,T_(n)) + B(3,D)    P(7,T_(n)) + B(7,E)    P(B,T_(n)) + B(B,F)
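The rotate-and-add flow of the B-trellis can be modelled in the same way as the tables above and checked against a state-by-state update. As before, this is only a sketch under stated assumptions: lists of four values stand in for four-field registers, a list rotation stands in for the field rotation of IR1, the branch metric table `bm[src][dst]` holds arbitrary test values, and all function names are invented for this example.

```python
import random


def sources(dest):
    """Source states of `dest` in the 16-state trellis of this description."""
    return [(dest >> 2) + 4 * k for k in range(4)]


def reference_update(pm, bm):
    """State-by-state update used only to check the multi-field version."""
    return [min(pm[s] + bm[s][d] for s in sources(d)) for d in range(16)]


def add4(a, b):
    return [x + y for x, y in zip(a, b)]       # stands in for one multi-field add


def min4(a, b):
    return [min(x, y) for x, y in zip(a, b)]   # stands in for one multi-field compare/select


def b_trellis(pm, bm):
    new_pm = [None] * 16
    for g in range(4):
        srcs = [g + 4 * f for f in range(4)]    # CMP(048C) when g = 0, and so on
        dests = [4 * g + f for f in range(4)]   # destination states 0..3 when g = 0
        ir1 = [pm[s] for s in srcs]
        tmp = []
        for _ in range(4):                      # four passes fill TMP(0..3)
            ir2 = [bm[srcs[f]][dests[f]] for f in range(4)]
            tmp.append(add4(ir1, ir2))
            ir1 = ir1[1:] + ir1[:1]             # rotate the fields of IR1
            srcs = srcs[1:] + srcs[:1]          # track which source sits in each field
        result = min4(min4(tmp[0], tmp[1]), min4(tmp[2], tmp[3]))
        for f, d in enumerate(dests):
            new_pm[d] = result[f]
    return new_pm


if __name__ == "__main__":
    rng = random.Random(0)
    pm = [rng.randrange(50) for _ in range(16)]
    bm = [[rng.randrange(4) for _ in range(16)] for _ in range(16)]
    print(b_trellis(pm, bm) == reference_update(pm, bm))  # True
```

Only one path metric register is read per destination group in this flow; the rotation supplies the remaining three alignments without further loads from the register file.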

[0074] After the four passes are complete, the compare operations of blocks 122, 124 and 126 are implemented, similar to those shown in blocks 92-96 of FIG. 12. In block 122, the respective fields of TMP(0) and TMP(1) are compared and the paths with the lowest path metric for each field are selected and stored back in TMP(0). In block 124, the respective fields of TMP(2) and TMP(3) are compared and the paths with the lowest path metric for each field are selected and stored in TMP(1). Finally, in block 126, TMP(0) and TMP(1) are compared and the lowest path metric for each field is stored in the appropriate register associated with the states being evaluated. For the B-Trellis of FIG. 10a, this would be CMP(0123).

[0075] After the selection of the lowest path metric for each of the destination states in the group, the next group is selected by blocks 128 and 130, until all the path metrics are complete. The path metrics computed by the B-trellis are used by the next A-trellis for computation of the next set of path metrics.

[0076] The present invention provides significant advantages over the prior art. By rearranging the trellises, multi-field additions and comparisons can be used to greatly speed the computations of the Viterbi decoder, thereby allowing a software decoder to be implemented.

[0077] Although the Detailed Description of the invention has been directed to certain exemplary embodiments, various modifications of these embodiments, as well as alternative embodiments, will be suggested to those skilled in the art. The invention encompasses any modifications or alternative embodiments that fall within the scope of the Claims.

1. A method of performing a Viterbi decoding function comprising the steps of: calculating candidate path metrics for states at time T_(n) based on previously calculated path metrics for states at time T_(n−1) and branch metrics associated with transitions between said states at time T_(n−1) and states at time T_(n) according to a first trellis; selecting path metrics for states at time T_(n) from said candidate path metrics; and calculating candidate path metrics for states at T_(n+1) based on said selected path metrics for states at T_(n) according to a second trellis, different from said first trellis.
2. The method of claim 1 wherein said step of calculating candidate path metrics according to a first trellis comprises the step of simultaneously calculating path metrics for a group of states at T_(n).
3. The method of claim 2 and further comprising the step of repeating said step of calculating path metrics for a group of states at T_(n) until path metric candidates for all states at T_(n) are generated.
4. The method of claim 2 wherein said step of simultaneously calculating path metrics for a group comprises the steps of: for each of j sets of states at T_(n−1), loading fields associated with a first operand of a processing device with respective path metrics of the set, loading a second operand of said processing device with corresponding branch metrics, adding said first and second operands to generate a result providing candidate path metrics for said group of states at T_(n) in respective fields of the result.
5. The method of claim 4 and further comprising the step of storing the result for each of said j sets in respective registers.
6. The method of claim 5 wherein said selecting step comprises the step of comparing respective fields of said registers to determine a smallest path metric for each state of said group.
7. The method of claim 6 and further comprising the step of updating a traceback array.
8. The method of claim 1 wherein said step of calculating candidate path metrics according to a second trellis comprises the step of simultaneously calculating path metrics for a group of states at T_(n+1).
9. The method of claim 8 and further comprising the step of repeating said step of calculating path metrics for a group of states at T_(n+1) until path metric candidates for all states at T_(n+1) are generated.
10. The method of claim 8 wherein said step of simultaneously calculating path metrics for a group of states at T_(n+1) comprises the steps of: loading fields associated with a first operand of a processing device with respective path metrics of a set of states at T_(n), loading respective fields of a second operand of said processing device with corresponding branch metrics, adding said first and second operands to generate a result providing candidate path metrics for said group of states at T_(n+1) in respective fields of the result.
11. The method of claim 10 and further comprising the step of generating additional candidate path metrics for said group of states at T_(n+1) by rotating the fields in said first operand, loading respective fields of the second operand with corresponding state metrics and adding said first and second operands.
12. A Viterbi decoder comprising: programmable processing circuitry for: calculating candidate path metrics for states at time T_(n) based on previously calculated path metrics for states at time T_(n−1) and branch metrics associated with transitions between said states at time T_(n−1) and states at time T_(n) according to a first trellis; selecting path metrics for states at time T_(n) from said candidate path metrics; and calculating candidate path metrics for states at T_(n+1) based on said selected path metrics for states at T_(n) according to a second trellis, different from said first trellis.
13. The Viterbi decoder of claim 12 wherein said programmable processing circuitry calculates candidate path metrics according to a first trellis by simultaneously calculating path metrics for a group of states at T_(n).
14. The Viterbi decoder of claim 13 wherein said programmable processing circuitry repeats the calculation of path metrics for a group of states at T_(n) until path metric candidates for all states at T_(n) are generated.
15. The Viterbi decoder of claim 13 wherein said programmable processing circuitry includes an arithmetic unit operable to perform multiple simultaneous logic operations on respective fields of first and second operands.
16. The Viterbi decoder of claim 15 wherein path metrics for a group are calculated by: for each of j sets of states at T_(n−1), loading fields associated with the first operand with respective path metrics of the set, loading the second operand with corresponding branch metrics, adding said first and second operands to generate a result providing candidate path metrics for said group of states at T_(n) in respective fields of the result.
17. The Viterbi decoder of claim 16 wherein said programmable processing circuitry includes respective registers for storing the result for each of said j sets.
18. The Viterbi decoder of claim 17 wherein said programmable processing circuitry selects path metrics by comparing respective fields of said registers to determine a smallest path metric for each state of said group.
19. The Viterbi decoder of claim 18 wherein said programmable processing circuitry stores said smallest path metrics in a traceback array.
20. The Viterbi decoder of claim 12 wherein said programmable processing circuitry calculates candidate path metrics according to a second trellis by simultaneously calculating path metrics for a group of states at T_(n+1).
21. The Viterbi decoder of claim 20 wherein said programmable processing circuitry repeats calculating path metrics for a group of states at T_(n+1) until path metric candidates for all states at T_(n+1) are generated.
22. The Viterbi decoder of claim 21 wherein said programmable processing circuitry includes an arithmetic unit operable to perform multiple simultaneous logic operations on respective fields of first and second operands.
23. The Viterbi decoder of claim 20 wherein said programmable processing circuitry calculates path metrics for a group of states at T_(n+1) by loading fields associated with the first operand of a processing device with respective path metrics of a set of states at T_(n), loading respective fields of the second operand of said processing device with corresponding branch metrics, and adding said first and second operands to generate a result providing candidate path metrics for said group of states at T_(n+1) in respective fields of the result.
24. The Viterbi decoder of claim 23 wherein said programmable processing device generates additional candidate path metrics for said group of states at T_(n+1) by rotating the fields in said first operand, loading respective fields of the second operand with corresponding state metrics and adding said first and second operands.